Change Detection and Maintenance of an XML Web Warehouse

Author

  • Ching-Ming Chao
Abstract

The World Wide Web contains a huge and growing volume of information. A web warehouse is an efficient and effective means of making that information usable, both for individual users and for business organizations, especially for decision-making. Meanwhile, XML has recently become the new standard for representing and exchanging data on the Web. In this paper, we therefore study the XML web warehouse and propose an approach to change detection and warehouse maintenance for it. The paper makes three major contributions. First, we propose an object-oriented data model for XML web pages in the web warehouse, as well as a system architecture for change detection and warehouse maintenance. Second, we propose a change detection method based on mobile agent technology that actively detects changes in the data sources of the web warehouse. Third, we propose an incremental and deferred maintenance method for the XML web pages in the web warehouse. We compare our approach experimentally with a rewriting approach to storage and maintenance of the XML web warehouse. The performance evaluation shows that our approach is more efficient than the rewriting approach in terms of both response time and storage space.
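
As a rough illustration of what "incremental and deferred" maintenance can mean, the Python sketch below compares two snapshots of an XML page, records only the differences as a delta, queues that delta, and applies it to the warehouse copy later instead of rewriting the whole page. It is not the paper's method: the abstract gives no algorithmic detail, the paper detects changes with mobile agents rather than by the plain snapshot comparison used here, and the `id` attribute, `detect_changes`, and `WarehousePage` are hypothetical names introduced only for this example.

```python
import xml.etree.ElementTree as ET


def index_by_id(xml_text):
    """Map each child's (hypothetical) `id` attribute to its text content."""
    root = ET.fromstring(xml_text)
    return {child.get("id"): (child.text or "") for child in root}


def detect_changes(old_xml, new_xml):
    """Compare two snapshots and return a delta of (operation, id, text) tuples."""
    old, new = index_by_id(old_xml), index_by_id(new_xml)
    delta = []
    for key in old.keys() - new.keys():
        delta.append(("delete", key, None))
    for key in new.keys() - old.keys():
        delta.append(("insert", key, new[key]))
    for key in old.keys() & new.keys():
        if old[key] != new[key]:
            delta.append(("update", key, new[key]))
    # Sort by (operation, id) so the delta has a deterministic order.
    return sorted(delta, key=lambda change: (change[0], change[1]))


class WarehousePage:
    """Warehouse copy of one page (assumed structure, for illustration only)."""

    def __init__(self, xml_text):
        self.items = index_by_id(xml_text)
        self.pending = []  # deltas queued but not yet applied

    def defer(self, delta):
        """Deferred: queue the changes instead of applying them immediately."""
        self.pending.extend(delta)

    def refresh(self):
        """Incremental: apply only the queued changes, not a full rewrite."""
        for operation, key, text in self.pending:
            if operation == "delete":
                self.items.pop(key, None)
            else:  # "insert" and "update" both set the new text
                self.items[key] = text
        self.pending.clear()


old_snapshot = '<page><item id="a">1</item><item id="b">2</item></page>'
new_snapshot = (
    '<page><item id="a">1</item><item id="b">3</item>'
    '<item id="c">4</item></page>'
)

page = WarehousePage(old_snapshot)
page.defer(detect_changes(old_snapshot, new_snapshot))
page.refresh()  # the warehouse copy now matches the new snapshot
```

The unchanged item `a` never appears in the delta, which is the intuition behind comparing such a scheme with an approach that rewrites the stored page on every change.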

Similar resources

Change-Centric Management of Versions in an XML Warehouse

We present a change-centric method to manage versions in a Web Warehouse of XML data. The starting point is a sequence of snapshots of XML documents we obtain from the web. By running a diff algorithm, we compute the changes between two consecutive versions. We then represent the sequence using a novel representation of changes based on completed deltas and persistent identifiers. We present t...

Xyleme: A Dynamic Warehouse for XML Data of the Web

Xyleme is a dynamic warehouse for XML data of the Web supporting query evaluation, change control and data integration. We briefly present our motivations, the general architecture and some aspects of Xyleme. The project we describe here was completed at the end of 2000. A prototype has been implemented. This prototype is now being turned into a product by a start-up company also called Xyleme ...

Acquiring XML pages for a WebHouse

Xyleme is a dynamic warehouse for the XML data of the web supporting change control and data integration. Major issues are the acquisition of XML data and keeping data up to date with the web as best as possible. This is the topic of the present paper.

An Approach for Generating an XML Data Warehouse Schema using Model Transformation Language

Traditionally, the multidimensional schema of the data warehouse is derived from data sources that are mainly the company’s internal data, well-known and structured, by identifying facts, dimensions and numeric measurements through a manual analysis of the operational schemas. With the proliferation of new platforms of communication in today’s information societies, there has been growing numbe...

Integrating Data Warehouses with Web Data for Olap Using Semantic Data Clustering Techniques

Nowadays, information retrieval plays an important role on the web. Many studies have presented techniques for retrieving information from databases. Previous work presented an extended tree pattern clustering process for massive XML storage. This paper presents a new technique, termed semantic data clustering (SDC), for combining data warehouse and web data for OLAP by retri...

Publication date: 2005